Zihao Mao

Cameron Matson

9/22/2017

Lab 3: Images

Introduction

For this lab we examine the images of the Stanford Dogs Dataset. The dataset consists of ~20,000 images of dogs from 120 different breeds.

Issues

The dataset is primarily used for fine-grained classification problems, meaning that the instances are all members of the same main class and are divided by subclass. In this case, the main class is 'Dog' and the subclass is the breed: 'Beagle', 'Poodle', 'Lab', and so on. These problems are potentially more difficult than standard classification because, in theory, all members of the main class share similar features. In other words, as the saying goes, "a dog is a dog is a dog, not a cat."

Another challenge with this dataset is that the images do not depict a standard scene. These are not faces of dogs. These are not photoshoot photos of dogs. The images are not even exclusively of dogs: some contain multiple dogs or even people. The dataset would benefit from preprocessing that standardizes the images so they are all of the same kind, using facial detection for instance.

Uses

We imagine one potential use for the fine-grained classification of dogs: searching for lost pets. Imagine poor Susan has lost her precious Bichon Frise, Tutu. She goes to her local police station and demands that they check all of the town's traffic cameras for traces of Tutu. Well, they say, there are hours of footage, and we don't want to look through it. Poor Susan. Now suppose there is a program that will "watch" the video and recognize when a four-legged animal is in view. The image could then be put through a classifier to detect whether that four-legged beast is a dog or a cat (or something else). Hooray! It's a dog! Now the image is put through a fine-grained classifier, which is able to tell that the dog IS in fact a Bichon Frise and not a Yorkshire Terrier. The police are then able to determine where Tutu is, and Susan is very happy.

Accuracy

How well does a system like that need to work? Each successive level probably does not need to be as precise as the last (and it likely won't be, because each successive level is more difficult than the last). The key point is that a human (with some knowledge of dog breeds) would be close to perfect at identifying dogs, but with thousands of street cameras around, it would take them a long time to go through all the footage. Assuming you do a good job of identifying the dogs in the image, you probably don't have to be that accurate at identifying the Bichon Frise. As long as you have as few false negatives as possible (so that you don't miss a potential Bichon), you can probably get away with a few false positives.
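This recall-over-precision tradeoff can be made concrete with a quick sketch. The counts below are made up purely for illustration:

```python
# Hypothetical confusion counts for the Bichon-detector stage (made-up numbers).
tp = 45  # Bichons correctly flagged
fn = 5   # Bichons missed -- this is the count we most want to keep low
fp = 30  # other breeds flagged by mistake -- tolerable, a human can discard them

recall = tp / (tp + fn)        # fraction of real Bichons we caught
precision = tp / (tp + fp)     # fraction of alerts that were actually Bichons
print(f"recall = {recall:.2f}, precision = {precision:.2f}")
# recall = 0.90, precision = 0.60
```

A recall of 0.90 with a precision of only 0.60 would be perfectly workable here: the operator reviews 75 alerts instead of hours of footage, and misses only 1 in 10 real sightings.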

In [1]:
# first we need to relabel the folders

#import os

#imagedir = '../../data/dogs/Images'
#for f in os.listdir(imagedir):
#    if f[0] == '.': # stupid .DS_Store on mac
#        continue
#    if '-' in f:
#        name = f.split('-',2)[1]
#        os.renames(os.path.join(imagedir,f), os.path.join(imagedir,name))
#
#for f in os.listdir(imagedir):
#    print(f)
In [2]:
# let's rename the images so they're more readable
#for breed in os.listdir(imagedir):
#    if breed[0] == '.': continue
#    for img in os.listdir(os.path.join(imagedir,breed)):
#        tail = img.split('_',2)[1]
#        name = breed+'_'+tail
#        os.rename(os.path.join(imagedir,breed,img), os.path.join(imagedir,breed,name))
In [3]:
import numpy as np
import os
import matplotlib.pyplot as plt
from scipy.misc import imresize
from skimage.color import rgb2gray
%matplotlib inline
imagedir = '../../data/dogs/Images'

Data Preprocessing

There are 120 different breeds included in the dataset, with about 150 images of each breed for a total of 20,580 images. The images are stored in directories by breed. To make the size of the dataset more manageable, we'll take a sample of 50 images from each of 60 breeds.

In [4]:
# remove dsstore
for d in os.listdir(imagedir):
    if d.find('.DS') != -1:
        os.remove(os.path.join(imagedir,d))
        continue
    for f in os.listdir(os.path.join(imagedir, d)):
        if f.find('.DS') != -1:
            os.remove(os.path.join(imagedir,d,f))
    
In [30]:
def load_images(num_samples, num_classes, h, w):
    
    # preinitialize the matrix
    img_arr = np.empty((num_samples*num_classes,h*w))  # num_samples instances of each class, each image flattened to h*w pixels
    label_arr = []
    i = 0
    
    # sample num_classes breeds from the dataset
    a = np.arange(len(os.listdir(imagedir)))
    np.random.shuffle(a)
    breed_sample_idxs = a[:num_classes]
    for idx in breed_sample_idxs:
        breed = os.listdir(imagedir)[idx]
        if breed[0] == '.' : 
            continue # stupid ds.store on mac
        print(int(i/num_samples),breed)
        
        # sample num_samples images from the breed
        b = np.arange(len(os.listdir(os.path.join(imagedir,breed))))
        np.random.shuffle(b)
        img_sample_idxs = b[:num_samples]
        for idx in img_sample_idxs:
            dog_path = os.path.join(imagedir,breed,os.listdir(os.path.join(imagedir,breed))[idx])            
            if (dog_path.find('.DS') != -1) : continue # stupid ds.store on mac

            img = plt.imread(dog_path)
            
            # converts image to gray, resizes it to be 200x200, and then linearizes it
            img_gray_resize_flat = rgb2gray(imresize(img, (h,w,3))).flatten()
                        
            img_arr[i] = img_gray_resize_flat
            i = i + 1

            # add name to list of labels
            fname = dog_path.split('/')[-1] # 'dog_name_123497.jpg'
            dog_name = fname[:fname.rfind('_')] # 'dog_name'
            label_arr.append(dog_name)
            
    return img_arr, label_arr
In [31]:
%%time
num_samples_per_breed = 50
num_breeds = 60
h=200
w=200
dogs, labels = load_images(num_samples=num_samples_per_breed, num_classes=num_breeds, h=h, w=w)
0 dhole
1 giant_schnauzer
2 soft
3 English_foxhound
4 Eskimo_dog
5 chow
6 briard
7 Airedale
8 cocker_spaniel
9 standard_schnauzer
10 Tibetan_mastiff
11 English_springer
12 Newfoundland
13 Japanese_spaniel
14 Sussex_spaniel
15 Weimaraner
16 Great_Dane
17 Sealyham_terrier
18 Siberian_husky
19 Cardigan
20 basenji
21 toy_terrier
22 Rottweiler
23 Ibizan_hound
24 Yorkshire_terrier
25 kelpie
26 Afghan_hound
27 clumber
28 Blenheim_spaniel
29 beagle
30 miniature_poodle
31 Brittany_spaniel
32 Brabancon_griffon
33 Bouvier_des_Flandres
34 Doberman
35 Irish_setter
36 miniature_pinscher
37 EntleBucher
38 Norwich_terrier
39 Irish_wolfhound
40 Saint_Bernard
41 Staffordshire_bullterrier
42 Irish_terrier
43 Norfolk_terrier
44 Irish_water_spaniel
45 Italian_greyhound
46 Pekinese
47 Scotch_terrier
48 golden_retriever
49 American_Staffordshire_terrier
50 Kerry_blue_terrier
51 Dandie_Dinmont
52 black
53 redbone
54 Chesapeake_Bay_retriever
55 malamute
56 Welsh_springer_spaniel
57 French_bulldog
58 groenendael
59 collie
Wall time: 19.5 s
In [7]:
import pandas as pd

X = pd.DataFrame(dogs)
X
Out[7]:
0 1 2 3 4 5 6 7 8 9 ... 39990 39991 39992 39993 39994 39995 39996 39997 39998 39999
0 0.090818 0.076531 0.081851 0.076806 0.052160 0.049072 0.058322 0.113797 0.289680 0.386039 ... 0.500904 0.405656 0.615961 0.843092 0.885664 0.897474 0.843092 0.701059 0.523175 0.580317
1 0.606644 0.433507 0.327304 0.321640 0.335867 0.417335 0.490102 0.496532 0.492045 0.496540 ... 0.117647 0.113725 0.101961 0.105882 0.109804 0.109804 0.101961 0.113725 0.113725 0.250980
2 0.217840 0.214506 0.218465 0.222425 0.212915 0.211516 0.170313 0.199982 0.228511 0.223168 ... 0.980392 0.949020 0.941176 0.992157 0.996078 0.996078 0.972549 0.952941 0.992157 0.992157
3 0.375890 0.348439 0.352361 0.371969 0.336675 0.324910 0.309774 0.343669 0.364393 0.386807 ... 0.417658 0.398051 0.382915 0.375072 0.373956 0.370034 0.372840 0.349593 0.342583 0.333601
4 0.621472 0.614462 0.609707 0.613629 0.617551 0.621472 0.615296 0.608286 0.596521 0.592600 ... 0.472261 0.415937 0.413476 0.464303 0.430521 0.417795 0.563074 0.373495 0.301195 0.255237
5 0.319490 0.331255 0.350863 0.360373 0.348608 0.342714 0.307420 0.308253 0.327861 0.385874 ... 0.213761 0.221054 0.240684 0.365364 0.301540 0.400986 0.373543 0.362902 0.321157 0.383902
6 0.011765 0.015686 0.019608 0.020173 0.021289 0.021855 0.014012 0.026059 0.029981 0.014577 ... 0.209608 0.209608 0.217451 0.221373 0.211557 0.211557 0.209608 0.199242 0.193654 0.189182
7 0.132177 0.154873 0.157143 0.113470 0.117429 0.108767 0.115197 0.119424 0.104906 0.074099 ... 0.255130 0.275259 0.260100 0.278845 0.310738 0.310456 0.277431 0.244973 0.284479 0.305761
8 0.038650 0.038650 0.030807 0.022964 0.022964 0.022964 0.022964 0.026885 0.034729 0.038650 ... 0.083877 0.078311 0.084465 0.084480 0.063771 0.090083 0.085291 0.086086 0.094503 0.102369
9 0.755611 0.731799 0.715830 0.723956 0.756459 0.790362 0.853404 0.901029 0.886191 0.860712 ... 0.587340 0.566899 0.568313 0.559354 0.573641 0.573076 0.572793 0.608370 0.605014 0.573091
10 0.334918 0.338274 0.339940 0.331815 0.317795 0.305747 0.303210 0.307131 0.334300 0.346898 ... 0.099906 0.099906 0.097629 0.090619 0.090619 0.092018 0.080818 0.081101 0.081101 0.081667
11 0.870573 0.870573 0.876734 0.880656 0.884577 0.880373 0.884012 0.883729 0.887651 0.888484 ... 0.388710 0.392632 0.388145 0.376380 0.380302 0.384223 0.376380 0.376380 0.376380 0.376380
12 0.087606 0.077508 0.076652 0.086467 0.094333 0.098560 0.097138 0.093745 0.093745 0.088990 ... 0.323174 0.314787 0.328784 0.318991 0.311148 0.310315 0.308365 0.296601 0.294078 0.299116
13 0.117915 0.125758 0.125758 0.113993 0.113993 0.125758 0.153209 0.161052 0.141444 0.127693 ... 0.447124 0.416585 0.366721 0.395005 0.422456 0.418535 0.414613 0.426378 0.418535 0.489123
14 0.885301 0.911881 0.935358 0.993578 0.992707 0.996346 0.999717 1.000000 1.000000 1.000000 ... 0.955577 0.961180 0.962884 0.954491 0.954491 0.958412 0.962051 0.958129 0.958412 0.950569
15 0.916093 0.916093 0.920014 0.923936 0.920014 0.920014 0.928125 0.932315 0.932315 0.932315 ... 0.861947 0.870341 0.854417 0.903560 0.988801 0.972258 0.963850 0.953827 0.952443 0.945984
16 0.117980 0.117980 0.117980 0.117980 0.117980 0.117980 0.121053 0.121604 0.129447 0.137290 ... 0.115117 0.122678 0.142285 0.224921 0.275902 0.142568 0.138647 0.177862 0.159921 0.161871
17 0.329868 0.345034 0.277854 0.328017 0.351249 0.364122 0.354895 0.273367 0.216187 0.184501 ... 0.298319 0.254570 0.183445 0.239494 0.432775 0.411754 0.313952 0.264622 0.302760 0.187985
18 0.354682 0.347665 0.249902 0.227771 0.344027 0.348796 0.299208 0.198073 0.284920 0.324702 ... 0.470320 0.470320 0.486007 0.442869 0.454634 0.478997 0.471154 0.424095 0.424095 0.439781
19 0.049031 0.052953 0.052953 0.052953 0.045109 0.037266 0.041188 0.045109 0.049031 0.052953 ... 0.510325 0.498560 0.478952 0.467187 0.475031 0.486795 0.482874 0.451501 0.439736 0.431893
20 0.204327 0.202355 0.218591 0.207377 0.215771 0.215488 0.226947 0.225809 0.225526 0.217965 ... 0.387047 0.402733 0.422318 0.398789 0.375260 0.379181 0.379181 0.379181 0.390113 0.389279
21 0.504655 0.500779 0.482532 0.360835 0.300216 0.306638 0.321796 0.328218 0.324526 0.327049 ... 0.272635 0.279407 0.275478 0.283321 0.288344 0.298687 0.307081 0.314358 0.318831 0.322469
22 0.058258 0.050415 0.058258 0.062180 0.054336 0.058258 0.062180 0.062180 0.062180 0.054336 ... 0.450084 0.469959 0.462667 0.428771 0.414201 0.436041 0.484247 0.486219 0.478658 0.463805
23 0.578066 0.593753 0.593753 0.589831 0.589831 0.602161 0.599356 0.607199 0.607764 0.607764 ... 0.189087 0.157134 0.141380 0.125895 0.111525 0.115662 0.137524 0.137845 0.118237 0.120209
24 0.588280 0.588280 0.588280 0.588280 0.588280 0.588280 0.588280 0.588280 0.588280 0.588280 ... 0.507199 0.499638 0.495166 0.532998 0.567742 0.575585 0.602195 0.632183 0.665528 0.685724
25 0.408751 0.459731 0.467574 0.479339 0.487182 0.475417 0.471496 0.486899 0.413781 0.386613 ... 0.375236 0.277747 0.336287 0.288335 0.282977 0.372087 0.264605 0.190125 0.188971 0.196762
26 0.262830 0.299500 0.246822 0.199733 0.154906 0.142025 0.187425 0.292482 0.267569 0.205649 ... 0.125156 0.156245 0.175571 0.152324 0.134963 0.154854 0.146728 0.169975 0.220955 0.189582
27 0.360219 0.360219 0.368062 0.375905 0.379827 0.379827 0.364140 0.368062 0.371984 0.379827 ... 0.121569 0.125490 0.127462 0.121569 0.122967 0.119611 0.155494 0.323593 0.463401 0.505437
28 0.087950 0.137852 0.165058 0.209045 0.213533 0.204298 0.199825 0.140436 0.087200 0.075687 ... 0.044842 0.035615 0.032236 0.014861 0.032236 0.052418 0.105646 0.119391 0.062784 0.023269
29 0.211302 0.211302 0.207380 0.215223 0.223066 0.226988 0.230909 0.230909 0.234831 0.234831 ... 0.093431 0.093431 0.089510 0.093431 0.089510 0.081667 0.082783 0.079695 0.075773 0.079695
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2970 0.598028 0.598028 0.598028 0.598028 0.601949 0.605871 0.605871 0.621557 0.621557 0.613714 ... 0.576076 0.525096 0.572155 0.513331 0.423135 0.423135 0.497645 0.603527 0.568233 0.481959
2971 0.861651 0.865573 0.873416 0.877337 0.885180 0.889102 0.896945 0.889102 0.842043 0.689102 ... 0.498629 0.588260 0.651266 0.668069 0.557974 0.535553 0.677272 0.674459 0.593505 0.634953
2972 0.445176 0.441299 0.268809 0.127357 0.181388 0.187512 0.271211 0.351027 0.336739 0.235627 ... 0.433930 0.444587 0.398644 0.439304 0.513814 0.433953 0.518285 0.457037 0.256875 0.229179
2973 0.749020 0.752941 0.760784 0.772549 0.780392 0.780392 0.784314 0.792157 0.796078 0.796078 ... 0.066667 0.070588 0.070588 0.066667 0.066384 0.080389 0.084027 0.126882 0.464985 0.733333
2974 0.161090 0.165011 0.168933 0.172855 0.172855 0.172855 0.165845 0.150992 0.123541 0.100011 ... 0.544533 0.548454 0.548454 0.540611 0.556297 0.540611 0.517082 0.513160 0.509238 0.493552
2975 0.127154 0.130510 0.138353 0.133048 0.112354 0.117161 0.113545 0.113262 0.113262 0.109341 ... 0.431442 0.419395 0.419089 0.394168 0.369812 0.325015 0.334257 0.313816 0.330909 0.332598
2976 0.139167 0.183986 0.281758 0.359348 0.427145 0.463288 0.476742 0.486275 0.494118 0.498039 ... 0.579827 0.508956 0.547033 0.525758 0.521836 0.591591 0.555464 0.565279 0.539510 0.549585
2977 0.305216 0.288130 0.201036 0.122924 0.098889 0.115178 0.116853 0.124131 0.132539 0.137309 ... 0.234655 0.287823 0.222080 0.252612 0.203604 0.286820 0.348464 0.205896 0.418175 0.226597
2978 0.424695 0.437278 0.278414 0.307531 0.353749 0.309220 0.295231 0.331366 0.323218 0.338621 ... 0.528576 0.510069 0.505046 0.505329 0.516245 0.507799 0.332660 0.304874 0.288920 0.268493
2979 0.329915 0.332170 0.348162 0.289376 0.280729 0.277671 0.280706 0.332758 0.383991 0.399112 ... 0.141444 0.157131 0.164974 0.157131 0.153209 0.106150 0.129680 0.157131 0.168895 0.149287
2980 0.120683 0.093232 0.136369 0.152055 0.136369 0.136369 0.132447 0.120683 0.128809 0.111456 ... 0.268766 0.253645 0.256466 0.248072 0.229595 0.221484 0.200485 0.186480 0.182841 0.182558
2981 0.542098 0.553863 0.541533 0.516605 0.476824 0.441529 0.467574 0.512110 0.480737 0.504267 ... 0.411883 0.428915 0.366453 0.461404 0.477633 0.452980 0.385495 0.483006 0.432562 0.414888
2982 0.125223 0.132256 0.135939 0.134845 0.137636 0.135902 0.134793 0.131742 0.129502 0.129502 ... 0.145412 0.141215 0.144258 0.150151 0.132538 0.147391 0.139265 0.152414 0.139258 0.132263
2983 0.094140 0.101983 0.101149 0.109543 0.101417 0.105339 0.111783 0.120460 0.119061 0.115139 ... 0.492376 0.484533 0.487338 0.484533 0.492376 0.478639 0.484227 0.487010 0.488126 0.497636
2984 0.128075 0.128916 0.288577 0.414909 0.362812 0.334811 0.375135 0.331700 0.291055 0.419879 ... 0.574775 0.357468 0.440671 0.533375 0.433654 0.470347 0.432262 0.477647 0.407319 0.401136
2985 0.366795 0.372115 0.377153 0.376580 0.374890 0.359747 0.354969 0.353280 0.347669 0.337020 ... 0.571653 0.506965 0.654021 0.695454 0.699070 0.667942 0.655307 0.673210 0.631740 0.603723
2986 0.034729 0.038650 0.034729 0.034729 0.034729 0.030807 0.027719 0.027719 0.031640 0.035562 ... 0.182679 0.183833 0.176280 0.167887 0.163697 0.155586 0.147475 0.136276 0.135198 0.131582
2987 0.893850 0.897771 0.901693 0.901693 0.901693 0.901693 0.905615 0.897771 0.897771 0.905615 ... 0.824110 0.812345 0.815984 0.843435 0.815984 0.815984 0.815984 0.843435 0.835041 0.818522
2988 0.476152 0.480073 0.487916 0.491838 0.487916 0.491838 0.495760 0.487916 0.499681 0.499681 ... 0.823232 0.831625 0.805008 0.763255 0.763255 0.767444 0.776120 0.790155 0.797165 0.808929
2989 0.246350 0.250271 0.258115 0.254193 0.258115 0.273801 0.297881 0.298431 0.286667 0.298431 ... 0.551132 0.523658 0.549160 0.521709 0.479687 0.456991 0.465668 0.483304 0.470988 0.461761
2990 0.138010 0.137727 0.148927 0.152848 0.170775 0.178335 0.187867 0.206076 0.216442 0.215594 ... 0.779097 0.596344 0.530243 0.618878 0.753694 0.784776 0.740209 0.727916 0.731838 0.748908
2991 0.027719 0.027719 0.035562 0.039484 0.039484 0.039484 0.041441 0.038635 0.042840 0.049261 ... 0.287627 0.266971 0.262349 0.261746 0.288579 0.310672 0.347618 0.281806 0.269558 0.182540
2992 0.081467 0.077545 0.073624 0.073624 0.069702 0.077545 0.077545 0.077545 0.081467 0.077545 ... 0.102844 0.098051 0.096667 0.093296 0.103357 0.115940 0.118157 0.118425 0.111981 0.102754
2993 0.769556 0.769556 0.769556 0.773478 0.777399 0.781321 0.776260 0.774288 0.778210 0.778210 ... 0.135680 0.097945 0.093198 0.099925 0.097976 0.097976 0.096309 0.098564 0.098282 0.094360
2994 0.605940 0.609861 0.613783 0.613783 0.613783 0.613783 0.617704 0.617704 0.617704 0.617704 ... 0.250683 0.301380 0.328831 0.285694 0.254322 0.273929 0.277851 0.270008 0.277851 0.226871
2995 0.340335 0.336414 0.344257 0.359943 0.379551 0.383473 0.391316 0.391316 0.395237 0.403080 ... 0.131659 0.127737 0.110369 0.063311 0.063311 0.078997 0.094683 0.102526 0.098605 0.082085
2996 0.031373 0.027451 0.027451 0.023529 0.031373 0.023529 0.019608 0.023529 0.031373 0.031373 ... 0.458748 0.514773 0.427895 0.435455 0.391492 0.430448 0.438291 0.383649 0.404082 0.335726
2997 0.793141 0.761768 0.793141 0.851964 0.895102 0.883337 0.891180 0.899023 0.895667 0.895667 ... 0.418023 0.498158 0.490881 0.454708 0.489391 0.498878 0.532758 0.487962 0.465029 0.552733
2998 0.592260 0.664820 0.601547 0.625665 0.611951 0.608312 0.519515 0.512482 0.558402 0.535171 ... 0.578180 0.516069 0.521962 0.468742 0.521977 0.584172 0.589775 0.561773 0.494556 0.421162
2999 0.103480 0.127843 0.196482 0.238786 0.258394 0.276029 0.097771 0.300087 0.343658 0.350325 ... 0.428729 0.420886 0.397074 0.389513 0.483080 0.458167 0.453695 0.464076 0.456783 0.431572

3000 rows × 40000 columns

In [8]:
ex = dogs[0].reshape((200,200))
plt.imshow(ex, cmap='gray')
plt.title(labels[0])
plt.show()
In [9]:
# taken from Class Demo #4
def plot_gallery(images, titles, h, w, n_row=3, n_col=6):
    """Helper function to plot a gallery of portraits"""
    plt.figure(figsize=(1.7 * n_col, 2.3 * n_row))
    plt.subplots_adjust(bottom=0, left=.01, right=1.5, top=.90, hspace=.35)
    
    # with slight modification
    sample = np.random.randint(low=0, high=images.shape[0], size=n_row*n_col)
    
    for i, idx in enumerate(sample):
        plt.subplot(n_row, n_col, i + 1)
        plt.imshow(images[idx].reshape((h, w)), cmap=plt.cm.gray)
        plt.title(titles[idx], size=12)
        plt.xticks(())
        plt.yticks(())
In [10]:
plot_gallery(dogs, labels, 200, 200) # defaults to showing a 3 by 6 subset of the dogs

Aren't they cute? The answer is yes. They are.

Linear Dimensionality Reduction

Full PCA

First, let's find the maximum number of principal components beyond which there is no further meaningful improvement in the explained variance ratio.

In [11]:
from sklearn.decomposition import PCA
h = 200
w = 200

n_components = 3000
print ("Extracting the top %d eigenfaces from %d faces" % (
    n_components, dogs.shape[0]))
pca = PCA(n_components=n_components)
%time pca.fit(dogs.copy())
eigenfaces = pca.components_.reshape((n_components, h, w))
Extracting the top 3000 eigenfaces from 3000 faces
Wall time: 45.3 s
In [12]:
def plot_explained_variance(pca):
    import plotly
    from plotly.graph_objs import Scatter, Marker, Layout, XAxis, YAxis, Bar, Line
    plotly.offline.init_notebook_mode() # run at the start of every notebook
    
    explained_var = pca.explained_variance_ratio_
    cum_var_exp = np.cumsum(explained_var)
    
    plotly.offline.iplot({
        "data": [Bar(y=explained_var, name='individual explained variance'),
                 Scatter(y=cum_var_exp, name='cumulative explained variance')
            ],
        "layout": Layout(xaxis=XAxis(title='Principal components'), yaxis=YAxis(title='Explained variance ratio'))
    })
      
In [13]:
plot_explained_variance(pca)

According to the graph above, adding components beyond roughly 1000 no longer yields meaningful improvement in the explained variance ratio. Therefore, we decided to use 1000 components for the PCA.
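A cutoff like this can also be picked programmatically from a fitted PCA instead of eyeballed from the plot. A minimal sketch (`components_for_variance` is our own helper, and the 0.97 threshold is simply the value we read off the graph):

```python
import numpy as np
from sklearn.decomposition import PCA

def components_for_variance(pca, threshold=0.97):
    """Smallest number of components whose cumulative explained
    variance ratio reaches `threshold` (pca must already be fitted)."""
    cum_var = np.cumsum(pca.explained_variance_ratio_)
    # np.argmax returns the index of the first True entry
    return int(np.argmax(cum_var >= threshold)) + 1

# tiny demo on synthetic data (not the dog images)
X = np.random.RandomState(0).rand(100, 20)
pca_demo = PCA().fit(X)
print(components_for_variance(pca_demo, threshold=0.9))
```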

In [14]:
n_components = 1000
print ("Extracting the top %d eigenfaces from %d faces" % (
    n_components, dogs.shape[0]))
pca = PCA(n_components=n_components)
%time pca.fit(dogs.copy())
eigenfaces = pca.components_.reshape((n_components, h, w))
Extracting the top 1000 eigenfaces from 3000 faces
Wall time: 32.4 s
In [15]:
eigenface_titles = ["eigenface %d" % i for i in range(eigenfaces.shape[0])]
In [16]:
# taken from Class Demo #4
def plot_gallery(images, titles, h, w, n_row=3, n_col=6):
    """Helper function to plot a gallery of portraits"""
    plt.figure(figsize=(1.7 * n_col, 2.3 * n_row))
    plt.subplots_adjust(bottom=0, left=.01, right=1.5, top=.90, hspace=.35)
    
    sample = np.arange(n_row*n_col)
    
    for i, idx in enumerate(sample):
        plt.subplot(n_row, n_col, i + 1)
        plt.imshow(images[idx].reshape((h, w)), cmap=plt.cm.gray)
        plt.title(titles[idx], size=12)
        plt.xticks(())
        plt.yticks(())
In [17]:
plot_gallery(eigenfaces, eigenface_titles, h, w)

Then let's see what the images look like after being reconstructed from the PCA.

In [18]:
# taken from Class Demo #4
def reconstruct_image(trans_obj,org_features):
    low_rep = trans_obj.transform(org_features)
    rec_image = trans_obj.inverse_transform(low_rep)
    return low_rep, rec_image
In [19]:
dogs_to_reconstruct = 1    
dogs_idx = dogs[dogs_to_reconstruct]
low_dimensional_representation, reconstructed_image = reconstruct_image(pca,dogs_idx.reshape(1, -1))
In [23]:
plt.subplot(1,2,1)
plt.imshow(dogs_idx.reshape((h, w)), cmap=plt.cm.gray)
plt.title('Original')
plt.grid()
plt.subplot(1,2,2)
plt.imshow(reconstructed_image.reshape((h, w)), cmap=plt.cm.gray)
plt.title('Reconstructed from Full PCA')
plt.grid()

As we can see from the comparison, although the reconstructed image is less sharp, it is very close to the original image. The 1000 components cover around 97% of the variance of the overall image dataset.

Randomized PCA

Let's try again with randomized PCA, which uses a randomized SVD solver to approximate the principal components more quickly, and compare the results.

In [21]:
n_components = 1000
print ("Extracting the top %d eigenfaces from %d faces" % (
    n_components, dogs.shape[0]))

rpca = PCA(n_components=n_components,svd_solver='randomized')
%time rpca.fit(dogs.copy())
eigenfaces = rpca.components_.reshape((n_components, h, w))
Extracting the top 1000 eigenfaces from 3000 faces
Wall time: 32.3 s
In [22]:
eigenface_titles = ["eigenface %d" % i for i in range(eigenfaces.shape[0])]
plot_gallery(eigenfaces, eigenface_titles, h, w)
In [24]:
dogs_to_reconstruct = 1    
dogs_idx = dogs[dogs_to_reconstruct]
low_dimensional_representation, reconstructed_image_random = reconstruct_image(rpca,dogs_idx.reshape(1, -1))
low_dimensional_representation, reconstructed_image = reconstruct_image(pca,dogs_idx.reshape(1, -1))

plt.figure(figsize=(1.7 * 3, 2.3 * 1))
plt.subplots_adjust(bottom=0, left=.01, right=1.5, top=.90, hspace=.35)
    
#origin
plt.subplot(1,3,1)
plt.imshow(dogs_idx.reshape((h, w)), cmap=plt.cm.gray)
plt.title('Original')
plt.grid()
#Full PCA
plt.subplot(1,3,2)
plt.imshow(reconstructed_image.reshape((h, w)), cmap=plt.cm.gray)
plt.title(' Full PCA')
plt.grid()
#Random PCA
plt.subplot(1,3,3)
plt.imshow(reconstructed_image_random.reshape((h, w)), cmap=plt.cm.gray)
plt.title(' Random PCA')
plt.grid()

As we can see, the reconstructed images are very similar: the randomized solver approximates the same decomposition that full PCA computes exactly, so the reconstructions should be nearly identical.
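We can sanity-check this equivalence numerically on synthetic low-rank data (our own toy example, not the dog images): both solvers should give essentially the same reconstruction error.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
# approximately rank-50 data plus a little noise
X = rng.rand(300, 50) @ rng.rand(50, 400) + 0.01 * rng.rand(300, 400)

errors = {}
for solver in ('full', 'randomized'):
    pca = PCA(n_components=50, svd_solver=solver, random_state=0)
    X_rec = pca.inverse_transform(pca.fit_transform(X))
    errors[solver] = np.mean((X - X_rec) ** 2)

print(errors)  # the two mean squared errors should agree closely
```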

Non-linear Dimensionality Reduction

Kernel PCA

Since kernel PCA takes too long to run on the whole dataset, we apply it only to a subgroup of the data: 1000 images instead of 3000. We will take 20 images from each of 50 classes.

In [32]:
dogs_sub, labels_sub = load_images(num_samples=20, num_classes=50, h=h, w=w)
0 Japanese_spaniel
1 Newfoundland
2 Leonberg
3 African_hunting_dog
4 Bouvier_des_Flandres
5 beagle
6 Border_collie
7 Siberian_husky
8 Staffordshire_bullterrier
9 Cardigan
10 cairn
11 bull_mastiff
12 Boston_bull
13 Italian_greyhound
14 boxer
15 Bedlington_terrier
16 Pembroke
17 Ibizan_hound
18 collie
19 Bernese_mountain_dog
20 Tibetan_terrier
21 cocker_spaniel
22 curly
23 Great_Dane
24 Irish_terrier
25 Mexican_hairless
26 clumber
27 German_short
28 Shih
29 malamute
30 komondor
31 bluetick
32 Appenzeller
33 Afghan_hound
34 English_setter
35 Irish_water_spaniel
36 Saint_Bernard
37 silky_terrier
38 Yorkshire_terrier
39 Blenheim_spaniel
40 kelpie
41 Sussex_spaniel
42 Great_Pyrenees
43 Eskimo_dog
44 Rhodesian_ridgeback
45 Welsh_springer_spaniel
46 Saluki
47 Australian_terrier
48 briard
49 groenendael
In [33]:
dogs_sub.shape
Out[33]:
(1000, 40000)
In [34]:
%%time
from sklearn.decomposition import KernelPCA

n_components = 500
print ("Extracting the top %d eigenfaces from %d faces" % (n_components, dogs_sub.shape[0]))

kpca = KernelPCA(n_components=n_components, kernel='rbf', 
                fit_inverse_transform=True, gamma=15, # very sensitive to the gamma parameter,
                remove_zero_eig=True)  
kpca.fit(dogs_sub.copy())
Extracting the top 500 eigenfaces from 1000 faces
Wall time: 4min 1s

To make a comparison with linear dimensionality reduction, let's also do a regular full PCA on the subgroup.

In [36]:
n_components = 1000
print ("Extracting the top %d eigenfaces from %d faces" % (
    n_components, dogs_sub.shape[0]))
pca = PCA(n_components=n_components)
%time pca.fit(dogs_sub.copy())
Extracting the top 1000 eigenfaces from 1000 faces
Wall time: 5.36 s
Out[36]:
PCA(copy=True, iterated_power='auto', n_components=1000, random_state=None,
  svd_solver='auto', tol=0.0, whiten=False)
In [37]:
plot_explained_variance(pca)

It seems that the PCA approaches its maximum performance at around 600 components. To see the difference in performance between the two methods, we need a smaller number, so we choose 500 components.

In [39]:
n_components = 500
print ("Extracting the top %d eigenfaces from %d faces" % (
    n_components, dogs_sub.shape[0]))
pca = PCA(n_components=n_components)
%time pca.fit(dogs_sub.copy())
Extracting the top 500 eigenfaces from 1000 faces
Wall time: 8.11 s
Out[39]:
PCA(copy=True, iterated_power='auto', n_components=500, random_state=None,
  svd_solver='auto', tol=0.0, whiten=False)
In [40]:
# taken from Class Demo #4
import warnings
warnings.simplefilter('ignore', DeprecationWarning)

from ipywidgets import widgets  


def plt_reconstruct(dogs_to_reconstruct):
    
    reconstructed_image = pca.inverse_transform(pca.transform(dogs_sub[dogs_to_reconstruct].reshape(1, -1)))
    reconstructed_image_kpca = kpca.inverse_transform(kpca.transform(dogs_sub[dogs_to_reconstruct].reshape(1, -1)))
    
    
    plt.figure(figsize=(15,7))
    
    plt.subplot(1,3,1)
    plt.imshow(dogs_sub[dogs_to_reconstruct].reshape((h, w)), cmap=plt.cm.gray)
    plt.title("Origin")
    plt.grid()
    
    plt.subplot(1,3,2)
    plt.imshow(reconstructed_image.reshape((h, w)), cmap=plt.cm.gray)
    plt.title('Full PCA with 500 n_comp')
    plt.grid()
    
    plt.subplot(1,3,3)
    plt.imshow(reconstructed_image_kpca.reshape((h, w)), cmap=plt.cm.gray)
    plt.title('Kernel PCA with 500 n_comp')
    plt.grid()
    
widgets.interact(plt_reconstruct,dogs_to_reconstruct=100,__manual=True)
Out[40]:
<function __main__.plt_reconstruct>

According to the comparison above, with the same number of components, kernel PCA does a much better job than regular full PCA. The quality of the images reconstructed by kernel PCA is close to the originals, while the images reconstructed from regular PCA are still unclear in some cases. The time comparison, however, is about 4 minutes for kernel PCA versus 8 seconds for regular PCA.

Let's see the comparison when we raise the number of components to the maximum for regular PCA.

In [42]:
n_components = 1000
print ("Extracting the top %d eigenfaces from %d faces" % (
    n_components, dogs_sub.shape[0]))
pca = PCA(n_components=n_components)
%time pca.fit(dogs_sub.copy())
Extracting the top 1000 eigenfaces from 1000 faces
Wall time: 5.39 s
Out[42]:
PCA(copy=True, iterated_power='auto', n_components=1000, random_state=None,
  svd_solver='auto', tol=0.0, whiten=False)
In [43]:
def plt_reconstruct_max(dogs_to_reconstruct):
    
    reconstructed_image = pca.inverse_transform(pca.transform(dogs_sub[dogs_to_reconstruct].reshape(1, -1)))
    reconstructed_image_kpca = kpca.inverse_transform(kpca.transform(dogs_sub[dogs_to_reconstruct].reshape(1, -1)))
    
    
    plt.figure(figsize=(15,7))
    
    plt.subplot(1,3,1)
    plt.imshow(dogs_sub[dogs_to_reconstruct].reshape((h, w)), cmap=plt.cm.gray)
    plt.title("Origin")
    plt.grid()
    
    plt.subplot(1,3,2)
    plt.imshow(reconstructed_image.reshape((h, w)), cmap=plt.cm.gray)
    plt.title('Full PCA with 1000 n_comp')
    plt.grid()
    
    plt.subplot(1,3,3)
    plt.imshow(reconstructed_image_kpca.reshape((h, w)), cmap=plt.cm.gray)
    plt.title('Kernel PCA with 500 n_comp')
    plt.grid()
    
widgets.interact(plt_reconstruct_max,dogs_to_reconstruct=100,__manual=True)
Out[43]:
<function __main__.plt_reconstruct_max>

Regular PCA takes far less time to reach its maximum performance. Kernel PCA requires fewer components, but it takes much more processing time: about 4 minutes for kernel PCA to reach the desired performance versus only about 5 seconds for regular full PCA. Therefore, we prefer regular full PCA for dimensionality reduction since it takes less time. Kernel PCA took more than an hour when we tried to apply it to all 3000 images.

Feature Extraction

Gradient

Let's start by doing a simple edge detection using the gradient (a.k.a. a Sobel filter).

In [11]:
from skimage.filters import sobel_h, sobel_v

idx_to_reconstruct = int(np.random.rand(1)*len(dogs))
img  = dogs[idx_to_reconstruct].reshape((h,w))

plt.figure(figsize=(15,30))
plt.subplot(1,4,1)
plt.imshow(img, cmap='gray')
plt.title(labels[idx_to_reconstruct]+' - original')

plt.subplot(1,4,2)
plt.imshow(sobel_v(img,), cmap='gray')
plt.title('v.sobel filter')

plt.subplot(1,4,3)
plt.imshow(sobel_h(img), cmap='gray')
plt.title('h.sobel filter')

plt.subplot(1,4,4)
gradient_mag = np.sqrt(sobel_v(img)**2 + sobel_h(img)**2 ) 
plt.imshow(gradient_mag, cmap='gray')
plt.title('gradient [v^2+h^2]^1/2')
plt.show()

Let's take the gradient of each image in the dataset and see if we can use it to classify the breed, or at least retrieve similar-looking images...

In [12]:
def take_gradient(row, shape):
    img = row.reshape(shape)
    gradient_mag = np.sqrt(sobel_v(img)**2 + sobel_h(img)**2 ) 
    return gradient_mag.reshape(-1)
# case
%time take_gradient(dogs[0], ((h,w))).shape
Wall time: 2.01 ms
Out[12]:
(40000,)
In [13]:
%time grad_features = np.apply_along_axis(take_gradient, 1, dogs, (h,w))
print(grad_features.shape)
Wall time: 2.51 s
(3000, 40000)

Let's take a quick look at some of these

In [14]:
plot_gallery(grad_features, labels, h, w) 

It seems to be a pretty good edge detector, but because (1) there is a lot of noise in the images and (2) the dogs are in many different poses, it probably isn't a very good feature for classification.

In [15]:
from sklearn.metrics.pairwise import pairwise_distances
# find the pairwise distance between all the different image features
%time dist_matrix = pairwise_distances(grad_features)
Wall time: 3.2 s
In [16]:
import copy
# find closest image to current image
idx1 = np.random.randint(0,len(dogs))
distances = copy.deepcopy(dist_matrix[idx1,:])
distances[idx1] = np.infty # dont pick the same image!
idx2 = np.argmin(distances)

plt.figure(figsize=(10,10))
plt.subplot(2,2,1)
plt.imshow(dogs[idx1].reshape((h,w)), cmap='gray')
plt.title("Original Image - " + labels[idx1])

plt.subplot(2,2,2)
plt.imshow(dogs[idx2].reshape((h,w)), cmap='gray')
plt.title("Closest Image - " + labels[idx2])

plt.subplot(2,2,3)
plt.imshow(grad_features[idx1].reshape((h,w)), cmap='gray')
plt.title("Original Image gradient")

plt.subplot(2,2,4)
plt.imshow(grad_features[idx2].reshape((h,w)), cmap='gray')
plt.title("Closest Image gradient")

plt.show()

This method doesn't work very well for this dataset since it is extremely sensitive to the position of the subject in the image. If two images are very "close" to one another under this method, it is more likely that the subjects of the images are in similar positions than that the subjects actually resemble one another. For example, consider the match from one iteration below:

image.png

No one would ever mistake a miniature schnauzer for an Irish setter. However, in these particular images the two dogs are both forward facing, approximately the same size relative to the size of the image, and pictured with a grassy background. Thus, by their gradients, the images are similar.

This method is essentially the same as a pixel-wise comparison. If an edge (i.e., a high gradient intensity) occurs at one pixel in image A and appears just one pixel over in image B, the distance (as computed by the Euclidean norm in `pairwise_distances`) is at least as large as if image B contained no edge at all.
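A tiny numeric sketch (our own, not from the lab) makes the point: a single edge shifted by one pixel is, by Euclidean distance, farther from the original than a completely blank image is:

```python
import numpy as np

# Two 1-D "gradient images" with a single sharp edge, one pixel apart
a = np.zeros(10); a[4] = 1.0
b = np.zeros(10); b[5] = 1.0
blank = np.zeros(10)

print(np.linalg.norm(a - b))      # sqrt(2) ~ 1.41
print(np.linalg.norm(a - blank))  # 1.0 -- the blank image is "closer" to a than b is!
```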

We can illustrate this by looking at a heatmap of the pairwise distance of the gradients:

In [17]:
import seaborn as sns

plt.figure(figsize=(10,9))
ax = sns.heatmap(dist_matrix[:200,:200], cmap='magma')

ax.set_xticks(np.arange(0,200,50))
ax.set_xticks(np.arange(0,200,10), minor=True)
ax.set_yticks(np.arange(0,200,50))
ax.set_yticks(np.arange(0,200,10), minor=True)

ax.set_xticklabels([*labels[0:200:50]])
ax.set_xticklabels(np.arange(0,200,10), minor=True)
ax.set_yticklabels([*labels[0:200:50]])
ax.set_yticklabels(np.arange(0,200,10), minor=True)

ax.grid(markevery=5, lw=4,color='black')

ax.set_title('Pairwise Distance of Gradient by Class')

plt.show()

This heatmap shows the pairwise distance between the instances of the first four breeds in the dataset. If minimizing the distance in the gradient were any good as a classifier, one would expect significantly darker colors within each major square along the diagonal than in the rest of the grid. Since each square in the 4x4 grid has roughly the same distribution of colors, we can conclude that this feature is not a good basis for classification.
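To double-check that reading of the heatmap numerically, one could compare the mean within-class distance against the mean between-class distance. The helper below is our own sketch (the name `class_block_means` is hypothetical); it assumes the notebook's ordering of 50 instances per breed:

```python
import numpy as np

def class_block_means(dist_matrix, class_size=50, n_classes=4):
    """Mean within-class vs. between-class distance for a matrix whose
    rows/columns are ordered class by class, `class_size` per class."""
    n = class_size * n_classes
    d = np.asarray(dist_matrix)[:n, :n]
    labels = np.repeat(np.arange(n_classes), class_size)
    same = labels[:, None] == labels[None, :]       # same-class mask
    off_diag = ~np.eye(n, dtype=bool)               # drop self-distances
    return d[same & off_diag].mean(), d[~same].mean()

# In the notebook this would be: intra, inter = class_block_means(dist_matrix)
# A feature that separated breeds would give intra clearly below inter.
```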

What if we look at those gradients by class?

In [18]:
plt.figure(figsize=(15,25))

plt.subplot(1,5,1)
breedA = grad_features[0:50]
average_breedA = np.apply_along_axis(func1d=np.mean, arr=breedA, axis=0).reshape((h,w))
plt.imshow(average_breedA)
plt.title(labels[0])

plt.subplot(1,5,2)
breedB = grad_features[51:100]
average_breedB = np.apply_along_axis(func1d=np.mean, arr=breedB, axis=0).reshape((h,w))
plt.imshow(average_breedB)
plt.title(labels[51])

plt.subplot(1,5,3)
breedC = grad_features[101:150]
average_breedC = np.apply_along_axis(func1d=np.mean, arr=breedC, axis=0).reshape((h,w))
plt.imshow(average_breedC)
plt.title(labels[101])

plt.subplot(1,5,4)
breedD = grad_features[151:200]
average_breedD = np.apply_along_axis(func1d=np.mean, arr=breedD, axis=0).reshape((h,w))
plt.imshow(average_breedD)
plt.title(labels[151])

plt.subplot(1,5,5)
allFour = grad_features[:200]
average_allFour = np.apply_along_axis(func1d=np.median, arr=allFour, axis=0).reshape((h,w))
plt.imshow(average_allFour)
plt.title('All Four')
Out[18]:
<matplotlib.text.Text at 0x1763f1ee2b0>

There clearly isn't much congruence among the images, which we knew already. When we average the gradient over a class, it essentially looks like white noise, not much different from the aggregate (here, a pixel-wise median) taken over all four breeds together.

Histogram of Oriented Gradients (HOG)

In [19]:
from skimage.feature import hog
# let's first visualize what the HOG descriptor looks like
features, img_desc = hog(image=img, block_norm='L2-Hys',visualise=True)

plt.figure(figsize=(10,10))
plt.subplot(1,2,1)
plt.imshow(img)
plt.subplot(1,2,2)
plt.imshow(img_desc, cmap='gray')
plt.grid()
In [20]:
def apply_hog(row,shape):
    feat = hog(row.reshape(shape), block_norm='L2-Hys')
    return feat.reshape((-1))

%time test_feature = apply_hog(dogs[3],(h,w))
test_feature.shape
Wall time: 19.1 ms
Out[20]:
(42849,)
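As an aside, the 42849 dimensions fall out of skimage's HOG defaults (orientations=9, pixels_per_cell=(8, 8), cells_per_block=(3, 3)) applied to a 200x200 image:

```python
# 200x200 image -> 25x25 cells of 8x8 pixels -> 23x23 overlapping 3x3 blocks,
# each block holding 3*3 cells * 9 orientation bins
orientations, cell, block = 9, 8, 3
cells_per_side = 200 // cell                   # 25
blocks_per_side = cells_per_side - block + 1   # 23
print(blocks_per_side ** 2 * block ** 2 * orientations)  # 42849
```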
In [21]:
# apply to entire data, row by row,
%time hog_features = np.apply_along_axis(apply_hog, 1, dogs, (h,w))
print(hog_features.shape)
Wall time: 41.2 s
(3000, 42849)
In [22]:
from sklearn.metrics.pairwise import pairwise_distances
# find the pairwise distance between all the different image features
%time dist_matrix = pairwise_distances(hog_features)
Wall time: 3.46 s
In [23]:
import copy
# find closest image to current image
idx1 = np.random.randint(len(dogs))
distances = copy.deepcopy(dist_matrix[idx1,:])
distances[idx1] = np.infty # dont pick the same image!
idx2 = np.argmin(distances)

plt.figure(figsize=(7,10))
plt.subplot(2,2,1)
plt.imshow(dogs[idx1].reshape((h,w)), cmap='gray')
plt.title("Original Image - "+labels[idx1])
plt.grid()

plt.subplot(2,2,2)
plt.imshow(dogs[idx2].reshape((h,w)), cmap='gray')
plt.title("Closest Image - "+labels[idx2])
plt.grid()
In [24]:
plt.figure(figsize=(10,9))
ax = sns.heatmap(dist_matrix[:200,:200], cmap='magma')

ax.set_xticks(np.arange(0,200,50))
ax.set_xticks(np.arange(0,200,10), minor=True)
ax.set_yticks(np.arange(0,200,50))
ax.set_yticks(np.arange(0,200,10), minor=True)

ax.set_xticklabels([*labels[0:200:50]])
ax.set_xticklabels(np.arange(0,200,10), minor=True)
ax.set_yticklabels([*labels[0:200:50]])
ax.set_yticklabels(np.arange(0,200,10), minor=True)

ax.grid(markevery=5, lw=4,color='black')

ax.set_title('Pairwise Distance of HOG by Class')

plt.show()

DAISY

Let's see if using the DAISY method is any more effective as a classifier.

In [25]:
from skimage.feature import daisy
# let's first visualize what the daisy descriptor looks like
features, img_desc = daisy(img,step=40, radius=10, rings=3, histograms=5, orientations=8, visualize=True)
plt.imshow(img_desc, cmap='gray')
plt.grid()
In [26]:
# create a function to take in a row of the matrix and return a new feature
def apply_daisy(row,shape):
    feat = daisy(row.reshape(shape),step=10, radius=10, rings=2, histograms=6, orientations=8, visualize=False)
    return feat.reshape((-1))

%time test_feature = apply_daisy(dogs[3],(h,w))
test_feature.shape
Wall time: 57.2 ms
Out[26]:
(33696,)
In [27]:
# apply to entire data, row by row,
# takes about a minute to run
%time daisy_features = np.apply_along_axis(apply_daisy, 1, dogs, (h,w))
print(daisy_features.shape)
Wall time: 2min 10s
(3000, 33696)
In [28]:
from sklearn.metrics.pairwise import pairwise_distances
# find the pairwise distance between all the different image features
%time dist_matrix = pairwise_distances(daisy_features)
Wall time: 2.71 s
In [29]:
import copy
# find closest image to current image
idx1 = np.random.randint(len(dogs))
distances = copy.deepcopy(dist_matrix[idx1,:])
distances[idx1] = np.infty # dont pick the same image!
idx2 = np.argmin(distances)

plt.figure(figsize=(7,10))
plt.subplot(1,2,1)
plt.imshow(dogs[idx1].reshape((h,w)), cmap='gray')
plt.title("Original Image - "+labels[idx1])
plt.grid()

plt.subplot(1,2,2)
plt.imshow(dogs[idx2].reshape((h,w)), cmap='gray')
plt.title("Closest Image - "+labels[idx2])
plt.grid()

Hey! It actually got one right!

image.png

However, these two images show the curly-haired dogs in VERY similar poses.

Now this is interesting

image.png

It looks like the outline of the branches is most similar to this very long haired dog.

On inspection, this does at least a little better than the gradient method, but it still seems to be keying on the context of the image more than the content in it, i.e., the position and orientation of the dog rather than the characteristics of the dog itself.

In [30]:
plt.figure(figsize=(10,9))
ax = sns.heatmap(dist_matrix[:200,:200], cmap='magma')

ax.set_xticks(np.arange(0,200,50))
ax.set_xticks(np.arange(0,200,10), minor=True)
ax.set_yticks(np.arange(0,200,50))
ax.set_yticks(np.arange(0,200,10), minor=True)

ax.set_xticklabels([*labels[0:200:50]])
ax.set_xticklabels(np.arange(0,200,10), minor=True)
ax.set_yticklabels([*labels[0:200:50]])
ax.set_yticklabels(np.arange(0,200,10), minor=True)

ax.grid(markevery=5, lw=4,color='black')

ax.set_title('Pairwise Distance of DAISY by Class')

plt.show()

When we look at the same heatmap as before, we see that the distances are reduced overall (the DAISY computation also seems to introduce a factor of about 10), but again there is no real difference between classes.
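To put a single number on that conclusion, one could run leave-one-out 1-nearest-neighbour classification straight from a precomputed distance matrix. This is our own sketch (the helper name is hypothetical), not part of the lab:

```python
import numpy as np

def nn_accuracy(dist_matrix, labels):
    """Leave-one-out 1-NN accuracy from a precomputed distance matrix."""
    d = np.array(dist_matrix, dtype=float, copy=True)
    np.fill_diagonal(d, np.inf)      # an image must not match itself
    nearest = np.argmin(d, axis=1)   # index of each image's closest neighbour
    labels = np.asarray(labels)
    return float(np.mean(labels[nearest] == labels))

# In the notebook: nn_accuracy(dist_matrix[:200, :200], labels[:200])
# With four breeds chance level is ~0.25; a score near chance would confirm
# that these distances carry little breed information.
```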